[Klaud Cold] Add minimaxm3-fp4-mi355x-atom (upstream branch for full-sweep validation) by andyluo7 · Pull Request #1813 · SemiAnalysisAI/InferenceX

andyluo7 · 2026-06-17T20:27:12Z

Upstream-branch mirror of #1812 (originally from indianspeedster/InferenceMAX:feat/minimaxm3-fp4-mi355x-atom) so the GPU full-sweep PR validation can run — fork PRs can't access the self-hosted runners/secrets that run-sweep.yml needs.

Original author: @indianspeedster. Mirrors PR #1812 at commit a68c303 (includes the Bugbot fix "use matrix MAX_MODEL_LEN").

Summary

Adds the minimaxm3-fp4-mi355x-atom config — MiniMax-M3 MXFP4 (amd/MiniMax-M3-MXFP4) on MI355X, single-node atom engine — for the 1k/1k and 8k/1k fixed-seq-len cells, TP4. Follows the ROCm/ATOM MiniMax-M3 recipe (FP4 on 4×MI355 section).

.github/configs/amd-master.yaml: new config entry + search space (TP4, conc 1→128, image rocm/atom-dev:M3).
benchmarks/single_node/fixed_seq_len/minimaxm3_fp4_mi355x_atom.sh: atom serve script — --block-size 128 (mandatory for MiniMax MSA), --gpu-memory-utilization 0.8, --trust-remote-code, --max-model-len $MAX_MODEL_LEN. KV cache left at default dtype (MXFP4 checkpoint ships no calibrated FP8 KV scales).
runners/launch_mi355x-amds.sh: route amd/MiniMax-M3* weights to the NFS cache.
perf-changelog entry.
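The `--max-model-len $MAX_MODEL_LEN` value comes from the sweep matrix. Per the commit message "use matrix MAX_MODEL_LEN (isl+osl+256)", it is the input length plus the output length plus 256. A minimal Python sketch of that arithmetic follows, assuming "1k" and "8k" mean 1024 and 8192 tokens; the function and dictionary names are hypothetical and not taken from the repo.

```python
def max_model_len(isl: int, osl: int, margin: int = 256) -> int:
    """Context length for one sweep cell: input + output + fixed margin."""
    return isl + osl + margin


# Assumption: "1k" = 1024 tokens and "8k" = 8192 tokens.
CELLS = {"1k1k": (1024, 1024), "8k1k": (8192, 1024)}

for name, (isl, osl) in CELLS.items():
    print(name, max_model_len(isl, osl))
```

Under those assumptions the two cells resolve to 2304 and 9472 tokens.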

Validation

generate_sweep_configs.py test-config → 16 configs (minimaxm3_1k1k, minimaxm3_8k1k, TP4 conc 1–128).
Smoke-tested on real MI355X (TP4 / conc-1 / 1k1k): atom server came up across 4 ranks, served, wrote a well-formed result JSON.
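The count of 16 is consistent with 2 sequence-length cells times 8 concurrency levels, if concurrency doubles from 1 to 128. The sketch below only reproduces that arithmetic. The doubling step and the dictionary field names are assumptions, and this is not the logic of generate_sweep_configs.py.

```python
from itertools import product


def doubling_range(start: int, end: int) -> list[int]:
    """Return start, 2*start, 4*start, ... up to and including end."""
    values = []
    conc = start
    while conc <= end:
        values.append(conc)
        conc *= 2
    return values


SEQ_LEN_CELLS = ["minimaxm3_1k1k", "minimaxm3_8k1k"]
CONCURRENCIES = doubling_range(1, 128)  # assumed powers of two

# One entry per (cell, concurrency) pair, all at TP4.
configs = [
    {"cell": cell, "tp": 4, "conc": conc}
    for cell, conc in product(SEQ_LEN_CELLS, CONCURRENCIES)
]
print(len(configs))  # 16
```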

Adding full-sweep-enabled to run the full PR validation sweep.

Closes/supersedes #1812 once validated.

Note

Low Risk
Benchmark and CI config only; no changes to auth, data handling, or production serving paths.

Overview
Adds a day-zero minimaxm3-fp4-mi355x-atom sweep for MiniMax-M3 MXFP4 (amd/MiniMax-M3-MXFP4) on MI355X using the ATOM engine, aligned with the ROCm/ATOM MiniMax-M3 recipe (TP4, --block-size 128 for MSA).

.github/configs/amd-master.yaml defines fixed-seq-len cells at 1k/1k and 8k/1k with TP4 and concurrency 1→128, image rocm/atom-dev:M3.

benchmarks/single_node/fixed_seq_len/minimaxm3_fp4_mi355x_atom.sh starts atom.entrypoints.openai_server with matrix MAX_MODEL_LEN, mandatory block size 128, and default KV cache dtype (no FP8 KV — the MXFP4 checkpoint has no calibrated scales). Optional eval and standard serving benchmark follow.

runners/launch_mi355x-amds.sh routes amd/MiniMax-M3* weights to the NFS Hugging Face cache, same as existing MiniMaxAI M3 paths.
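The routing rule amounts to a glob match on the model ID. The runner itself is a shell script, so the Python sketch below only illustrates the matching. Both cache paths, the fallback to a local cache, and the `MiniMaxAI/MiniMax-M3*` pattern are hypothetical placeholders, not values from the repo.

```python
from fnmatch import fnmatchcase

# Hypothetical values for illustration; the real paths live in the runner script.
NFS_HF_CACHE = "/nfs/hf-cache"
LOCAL_HF_CACHE = "/local/hf-cache"
NFS_PATTERNS = ["MiniMaxAI/MiniMax-M3*", "amd/MiniMax-M3*"]


def cache_dir_for(model_id: str) -> str:
    """Pick the NFS cache when the model ID matches a routed pattern."""
    if any(fnmatchcase(model_id, pattern) for pattern in NFS_PATTERNS):
        return NFS_HF_CACHE
    return LOCAL_HF_CACHE  # assumed default for unrouted models


print(cache_dir_for("amd/MiniMax-M3-MXFP4"))
```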

perf-changelog.yaml documents the new config key.

Reviewed by Cursor Bugbot for commit c251a03.

Smoke-tested on MI355X (mia1-p01-g07): TP4 conc-1 1k1k served and benchmarked cleanly (mean TPOT 6.8 ms). KV cache left at default dtype: amd/MiniMax-M3-MXFP4 has no calibrated FP8 KV scales, so --kv_cache_dtype fp8 trips an assertion in the MSA fused_qknorm kernel.


andyluo7 · 2026-06-17T20:37:08Z

/sweep test-config --config-keys minimaxm3-fp4-mi355x-atom --config-files .github/configs/amd-master.yaml

github-actions · 2026-06-17T20:37:27Z

@andyluo7 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/27718177974
Command: test-config --config-keys minimaxm3-fp4-mi355x-atom --config-files .github/configs/amd-master.yaml
Pinned ref: 61a6a94
Approval: not required (trusted collaborator).

github-actions · 2026-06-18T03:53:48Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27732861480
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27732861480

seungrokj

LGTM

seungrokj · 2026-06-18T03:54:23Z

/reuse-sweep-run

seungrokj · 2026-06-18T03:54:27Z

/merge-prs

seungrokj · 2026-06-18T03:54:40Z

@functionstackx @cquil11 can you plz approve this ?

Oseltamivir

lgtm

indianspeedster added 4 commits June 17, 2026 19:16

minimaxm3-fp4-mi355x-atom: route amd/MiniMax-M3* weights to NFS cache

2d7158a

minimaxm3-fp4-mi355x-atom: fill perf-changelog pr-link

9a2b0f4

minimaxm3-fp4-mi355x-atom: use matrix MAX_MODEL_LEN (isl+osl+256)

a68c303

andyluo7 requested a review from a team June 17, 2026 20:27

andyluo7 requested review from 1am9trash, billishyahao, chunfangamd, seungrokj and yctseng0211 as code owners June 17, 2026 20:27

github-project-automation Bot added this to InferenceMAX Board Jun 17, 2026

andyluo7 added full-sweep-enabled and removed full-sweep-enabled labels Jun 17, 2026

trigger full-sweep validation

771ed59

andyluo7 marked this pull request as draft June 17, 2026 20:32

andyluo7 marked this pull request as ready for review June 17, 2026 20:33

perf-changelog: point minimaxm3-fp4-mi355x-atom pr-link to upstream PR #1813

61a6a94

andyluo7 mentioned this pull request Jun 17, 2026

[Klaud Cold] Add minimaxm3-fp4-mi355x-atom #1812

Closed

seungrokj added the AMD label Jun 18, 2026

Merge branch 'main' into feat/minimaxm3-fp4-mi355x-atom

c251a03

seungrokj approved these changes Jun 18, 2026

Oseltamivir approved these changes Jun 18, 2026

seungrokj merged commit cc78fc9 into main Jun 18, 2026
40 checks passed

seungrokj deleted the feat/minimaxm3-fp4-mi355x-atom branch June 18, 2026 04:06

github-project-automation Bot moved this to Done in InferenceMAX Board Jun 18, 2026

seungrokj merged 7 commits into main from feat/minimaxm3-fp4-mi355x-atom

4 participants
